OVERCOMPLETE BASIS TRANSFORM-BASED MOTION
RESIDUAL FRAME CODING METHOD AND APPARATUS FOR
VIDEO COMPRESSION



FIELD OF THE INVENTION 

The present invention pertains to the field of compression and in particular to video
compression methods and apparatuses.

BACKGROUND 

A sequence of pictures can occupy a vast amount of storage space and require very high
transmission bandwidth when represented in an uncompressed digital form. Point to
point digital video communication became practicable several years ago following
advances in computer networks and signal compression technology.

The standardization effort for digital video compression was initiated in approximately
1988. Currently, the Moving Picture Experts Group (MPEG) committee under ISO/IEC
has completed both the MPEG-1 and the MPEG-2 standards; the MPEG-4 standard has
also been completed, but new proposals are still being accepted. In addition, CCITT
developed a series of recommendations - H.261, H.263 and H.263+ - that focus on low
bit rate applications. All of these attempts at standardization utilize a two-step
procedure to compress a video sequence. The first step uses a motion estimation and
compensation algorithm to create a predicted video frame for the current video frame
using the previous video frame, wherein the difference between the current video frame
and the predicted video frame is computed and is called the motion residual picture
(MRP). The second step in the standard procedure is to code the MRP using the
Discrete Cosine Transform (DCT). Such DCT-based systems do not perform well in all
circumstances. At the low bit rates needed for personal video communication, DCT-based
systems cause noticeable distortion and visible block artifacts. For high visual
quality applications, such as DVD, the compression ratio achieved can be quite low.

Motion residual pictures can be coded using other transform-based techniques. For
example, discrete wavelet transforms (DWT) and overcomplete basis transforms can
also be used. Zakhor and Neff presented a motion residual coding system in U.S. Patent
No. 5,699,121 based on an overcomplete basis transform algorithm called matching
pursuit. This was first proposed by Mallat and Zhang in IEEE Transactions on Signal
Processing, vol. 41, No. 12, Dec. 1993. Zakhor and Neff's video coder improves both
the visual quality and the PSNR over standard DCT-based video coders. However, their
system is very slow and the compression performance is not optimized due to an ad-hoc
design for matched basis position coding and quantization of the transform coefficients.
Therefore there is a need for a new overcomplete transform based video coding
technique that can provide both speed and efficiency.

This background information is provided for the purpose of making known information
believed by the applicant to be of possible relevance to the present invention. No
admission is necessarily intended, nor should be construed, that any of the preceding
information constitutes prior art against the present invention.

SUMMARY OF THE INVENTION

An object of the present invention is to provide an overcomplete basis transform-based
motion residual frame coding method and apparatus for video compression. In
accordance with an aspect of the present invention, there is provided a method for
encoding a residual image using basis functions from an overcomplete library, said
method comprising the steps of: obtaining the residual image, said residual image
having a size and an energy; and decomposing said residual image into a list of one or
more atoms, each atom representing a basis function from the overcomplete library, said
step of decomposing said residual image including the steps of: (i) identifying a
replacement region in the residual image for representation by an atom using a residual
energy segmentation algorithm; (ii) creating a subset of basis functions from the
overcomplete library, each basis function in the subset matching with the replacement
region within a predetermined threshold; (iii) identifying an atom within the subset of
basis functions, said atom for representing the replacement region and said atom having
parameters; (iv) quantizing said atom and modifying the parameters of the atom into a
form suited for encoding; (v) encoding said quantized atom, subtracting said atom from
the replacement region in the residual image thereby reducing the energy of the residual
image and using a quadtree-based atom coder to reduce the size of the residual image;
and (vi) comparing the reduced size of the residual image or the reduced energy of the
residual image with a predetermined criteria and repeating steps (i) to (vi) until the
predetermined criteria is achieved; thereby encoding said residual image and reducing
the size thereof to a predetermined level.

In accordance with another aspect of the present invention there is provided an apparatus
for encoding a residual image using basis functions from an overcomplete library, said
apparatus comprising: means for obtaining the residual image, said residual image
having a size and an energy; and means for decomposing said residual image into a list
of one or more atoms, each atom representing a basis function from the overcomplete
library, said means for decomposing said residual image including: (i) means for
identifying a replacement region in the residual image for representation by an atom
using a residual energy segmentation algorithm; (ii) means for creating a subset of basis
functions from the overcomplete library, each basis function in the subset matching with
the replacement region within a predetermined threshold; (iii) means for identifying an
atom within the subset of basis functions, said atom for representing the replacement
region and said atom having parameters; (iv) means for quantizing said atom and
modifying the parameters of the atom into a form suited for encoding; (v) means for
encoding said quantized atom, subtracting said atom from the replacement region in the
residual image thereby reducing the energy of the residual image and using a quadtree-based
atom coder to reduce the size of the residual image; and (vi) means for comparing
the reduced size of the residual image or the reduced energy of the residual image with a
predetermined criteria; thereby encoding said residual image and reducing the size
thereof to a predetermined level.

In accordance with another aspect of the present invention there is provided a computer
program product comprising a computer readable medium having a computer program
recorded thereon for performing a method for encoding a residual image using basis
functions from an overcomplete library comprising the steps of: obtaining the residual
image, said residual image having a size and an energy; and decomposing said residual
image into a list of one or more atoms, each atom representing a basis function from the
overcomplete library, said step of decomposing said residual image including the steps
of: (i) identifying a replacement region in the residual image for representation by an
atom using a residual energy segmentation algorithm; (ii) creating a subset of basis
functions from the overcomplete library, each basis function in the subset matching with
the replacement region within a predetermined threshold; (iii) identifying an atom within
the subset of basis functions, said atom for representing the replacement region and said
atom having parameters; (iv) quantizing said atom and modifying the parameters of the
atom into a form suited for encoding; (v) encoding said quantized atom, subtracting said
atom from the replacement region in the residual image thereby reducing the energy of
the residual image and using a quadtree-based atom coder to reduce the size of the
residual image; and (vi) comparing the reduced size of the residual image or the reduced
energy of the residual image with a predetermined criteria and repeating steps (i) to (vi)
until the predetermined criteria is achieved; thereby encoding said residual image and
reducing the size thereof to a predetermined level.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 illustrates the overall diagram of video compression systems that use the
overcomplete basis transform and associated coding methods according to one embodiment
of the present invention.

Figure 2 is an example of a motion residual image processed by one embodiment of the
present invention.

Figure 3 illustrates a simple dictionary with 16 bases for use with one embodiment of
the present invention.

Figure 4 describes the whole atom decomposition process based on an overcomplete basis
according to one embodiment of the present invention.

Figure 5 describes the basic steps executed by the residual energy segmentation
algorithm (RESA) according to one embodiment of the present invention.

Figure 6 illustrates the first step of RESA according to one embodiment of the present
invention.

Figure 7 illustrates the second step of RESA: the horizontal growing scheme, according
to one embodiment of the present invention.

Figure 8 illustrates the third step of RESA: the vertical growing scheme, according to
one embodiment of the present invention.

Figure 9 describes the matching pursuit atom search using the progressive elimination
algorithm (PEA) according to one embodiment of the present invention.

Figure 10 illustrates how to form the sub-dictionary of matching basis and searching
position candidates according to one embodiment of the present invention.

Figure 11 illustrates the fast calculation of region energy according to one embodiment
of the present invention.

Figure 12 illustrates the parameters for one atom according to one embodiment of the
present invention.

Figure 13 is an example of an atom position map according to one embodiment of the
present invention.

Figure 14 is a flowchart illustrating the atom encoding process according to one
embodiment of the present invention.

Figure 15 is a flowchart illustrating the decoding of a compressed residual signal
according to one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION 
The current invention is a new coder for overcomplete-transform based residual picture
coding, used for motion compensated video compression systems. This invention is
analogous to previous matching pursuit video coders in that it decomposes the residual
image into a list of atoms, which represent basis functions from an overcomplete
dictionary. The atom finding process, however, is performed using a Residual Energy
Segmentation Algorithm (RESA) and a Progressive Elimination Algorithm (PEA). The
basis dictionary can be very large in order to characterize the features appearing
frequently in motion residual images. To find an atom, RESA identifies the
approximate shape and position of regions with high energy in the motion residual
image, so that a good match can be found by comparison with a smaller subset of
bases within the dictionary. PEA progressively removes
candidates from consideration by pre-computing the energy of search windows, thereby
reducing the computation needed to find the best match. Whenever a matched
atom is found, the residual image is updated by removing the part characterized by the
atom. The foregoing steps of finding atoms and updating residual images are repeated
until the desired compression bit rate or quality has been achieved.

The invention introduces a new modulus quantization scheme for matching pursuit with
an overcomplete basis, that changes the atom finding procedure. The coefficients
produced directly from the transform are continuous floating-point values, which require
quantization for optimal digital coding under a bit budget. In the matching pursuit
algorithm, it is necessary to use an in-loop quantizer - where each found atom is first
quantized, and then used to update the residual image. As such each atom affects the
selection of subsequent atoms. If the quantizer is specified before coding begins, as in
previous matching pursuit methods, it is difficult to optimize the quantization scheme as
the optimal quantizer design depends on statistics of the list of chosen atom moduli.
The quantization scheme according to the present invention chooses the quantizer
adaptively during the atom searching process.

In addition to the atom modulus, the index of the chosen basis and the position of the
atoms need to be transmitted in an overcomplete-transform based coder. The invention
includes a method to code the atom position information efficiently. The atom position
distribution forms a 2D map, where pixel values of one and zero represent the presence
of atoms or lack thereof in each position respectively. A quadtree-like technique enables
coding of the position map. The modulus and basis index information are embedded in
the position coding. The atoms for different channels of color video (Y, U, V) are coded
independently.

All atom parameters are transmitted after they have been encoded into a compressed
version of the residual images. For the decoding process, the decoder reconstructs the
residual image through interpreting the coded bit stream back into atom parameters and
combining the atom information to form the reconstructed stream of residual images that
are then combined [...] forming the atom decomposition of the residual image in an
overcomplete [...] the selected basis. The present invention further provides for decoding
residual signals that have been encoded using [...].

Figure 1 illustrates the associated processing executed by a video compression
apparatus that employs the residual image coder according to the present invention. The
video frame is initially processed by a motion estimator, which compares the current
frame with one or two reference frames. [...] video change their position [...] stays the
same. Since the reference frames have been transmitted to the video decoder 12, some
regions in the reference frame can be used to construct the current frame. The motion
estimator identifies those regions within the reference frames that are similar to regions
within the current frame. The motion compensator [...] motion residual image. [...]
motion vectors, which are processed by the motion vector encoder 34. The atom
decomposer 40 processes the residual image first, and then the atom encoder 42
compresses the resulting atoms. The coded motion vectors and atoms are combined into
one bit stream by the multiplexer. [...] which can deliver the video in compressed format
to the video decoder 12.

The lower part of Figure 1 illustrates the decoder 12, in which the demultiplexer 26
separates the compressed video signal, sending the corresponding bits to the motion
vector decoder 36 and the residual image decoder 28, respectively. The motion [...]
vectors. The residual image decoder 28 reconstructs the residual image. These two
signals, namely the prediction frame and the residual image, are added together to
generate the final reconstructed video frame.

Figure 2 is an example motion residual image for the Y colour channel. The original
residual image has both negative and positive values. For proper displaying of the 
residual image as a 256 level gray image, the pixel values in the residual image are 
shifted and scaled so that pure gray means zero, while black and white represent 
negative and positive values, respectively. For example, the residual image comprises 
several high-energy regions, which correspond to the motion of objects in the video. 

Most signal compression techniques transform the original data into some more compact
format through different kinds of mathematical transformations. Some mathematical
transforms, such as DCT and DWT, use a complete basis, which forms an invertible
transformation matrix. Recently, overcomplete basis and associated transformation
algorithms have received considerable attention. The number of bases in an
overcomplete basis dictionary is much larger than the dimension of the original data.
The benefit of an overcomplete basis is that the transformed coefficients are more
effective in representing the true features in the original signal. There exist many
mathematical methods to build a basis dictionary for different signals. Several
dictionaries for video motion residual pictures have been designed and have been proven
to cover the features in residual pictures well. For example, a basis dictionary based on
separable Gabor functions has been described by Neff and Zakhor in "Very Low Bit
Rate Video Coding Based on Matching Pursuits", IEEE Transactions on Circuits and
Systems for Video Technology, Feb. 1997, 158-171, and a basis dictionary based on
Haar functions has been described by Vleeschouwer and Macq in "New dictionaries for
matching pursuit video coding", Proc. of the 1998 International Conference on Image
Processing, vol. 1, 764-768. Figure 3 is a simple example dictionary containing
16 bases. Any of the above dictionaries can be used with the present invention. Having
particular regard to the above-mentioned Gabor dictionary, there are 400 2D functions
explicitly mentioned. However, it actually includes many more basis structures
implicitly since each of those 400 2D functions can be placed at every possible position
within the image. Using a frame size of 176x144 pixels implies that the dictionary
actually contains 400x176x144 = 10,137,600 (about 10.1 million) basis structures - which
makes it highly overcomplete. Applying the transformation directly using the "matching
pursuit algorithm" described by S. Mallat and Z. Zhang in "Matching Pursuits With
Time-Frequency Dictionaries", IEEE Transactions on Signal Processing, vol. 41, No. 12,
Dec. 1993, will take an extremely large number of computations to determine the
transform coefficients. The matching pursuit for video compression, invented by Zakhor
and Neff in US Patent No. 5,699,121, reduces the calculation burden; however, it remains
computationally expensive. The present invention provides a way to transform residual
images based on general dictionaries, which is performed by the atom decomposer 40,
and a way to code the transformed coefficients, which is the task of the atom encoder 42.

The operation of the atom decomposer 40 is fully described in Figure 4, according to
one embodiment. The first step (block 61) executed by the atom decomposer 40 is to
find the initial search region. This step is realized by the residual energy segmentation
algorithm (RESA), wherein one embodiment thereof is shown in Figure 5. RESA is
based on a general region growing idea. It initially selects a 2x2 block as a starting point
for region growing (block 70). This step requires the division of the residual image into
16x16 blocks, as shown in Figure 6. The energy, which is the sum of the squares of all
pixel intensities, is computed for each block, and the block with the highest energy is
identified as block 71 shown in Figure 6, for example. Block 71 is further divided into
four 8x8 sub-blocks, and the sub-block 72 with the highest energy is identified. Within
that 8x8 sub-block 72, the highest energy 2x2 block 73 is also identified, wherein this
block will be used as the starting point for region growing.
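
The seed selection just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names (block_energy, best_block, resa_seed), the list-of-rows image layout, and the treatment of images whose sides are not multiples of 16 (blocks are simply clipped at the border) are all choices made for the example.

```python
def block_energy(img, top, left, size):
    """Sum of squared pixel values in a size x size block (clipped at the image border)."""
    rows = img[top:top + size]
    return sum(v * v for row in rows for v in row[left:left + size])


def best_block(img, top, left, height, width, size):
    """Return (top, left) of the highest-energy size x size block on a regular
    grid inside the area that starts at (top, left)."""
    best_energy, best_pos = -1.0, (top, left)
    for t in range(top, top + height, size):
        for l in range(left, left + width, size):
            e = block_energy(img, t, l, size)
            if e > best_energy:
                best_energy, best_pos = e, (t, l)
    return best_pos


def resa_seed(img):
    """Pick the 2x2 starting block: best 16x16 block, then its best 8x8
    sub-block, then the best 2x2 block inside that sub-block."""
    h, w = len(img), len(img[0])
    t16, l16 = best_block(img, 0, 0, h, w, 16)
    t8, l8 = best_block(img, t16, l16, 16, 16, 8)
    return best_block(img, t8, l8, 8, 8, 2)


if __name__ == "__main__":
    image = [[0] * 32 for _ in range(32)]
    image[21][10] = 5  # a single energetic pixel
    print(resa_seed(image))  # (row, column) of the 2x2 seed block
```

The hierarchical search only evaluates 2x2 energies inside one 8x8 sub-block, which is the point of the coarse-to-fine selection in the text.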

The next step of RESA (block 74 illustrated in Figure 5) is to check the 2x2 block on the
left side of the current region. Figure 7 illustrates this step of RESA. A threshold is
calculated dynamically as:

T = AE*max(7-AU, 5)/10

where AU is the number of blocks that have been added on the left side of the start
block, and AE is the average energy per 2x2 block of the current region. If the energy of
the checked 2x2 block is larger than the current threshold, the tested 2x2 block is
grouped with the current region, together forming a new larger current region.
Otherwise, a stop point has been found on this side, and we do not group the blocks
together. In a similar, symmetric fashion, check the 2x2 block on the right side of the
current region. Continue growing first the left side and then the right side, until stop
points are found on both sides or the width of the rectangle has reached 32 (whichever
comes first). A horizontal strip rectangle 75 is formed after this step, wherein the
dimension of the strip is 2*2m, 1<=m<=16.
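
A hedged Python sketch of the horizontal growing step follows. It assumes that the left and right sides are tried alternately on each pass, that AU is counted separately for each side, and that reaching the image border closes a side; the text does not spell out these details, and the names grow_horizontal and energy_2x2 are hypothetical.

```python
def energy_2x2(img, top, left):
    """Energy (sum of squares) of the 2x2 block whose top-left pixel is (top, left)."""
    return sum(img[r][c] ** 2 for r in (top, top + 1) for c in (left, left + 1))


def grow_horizontal(img, top, left, max_width=32):
    """Grow a 2-pixel-high strip from the 2x2 seed at (top, left).
    Returns (leftmost column, width) of the resulting strip."""
    image_width = len(img[0])
    lo, hi = left, left + 2                  # strip covers columns [lo, hi)
    total = energy_2x2(img, top, left)       # energy of the current region
    added = {"left": 0, "right": 0}          # AU, counted per side (assumption)
    is_open = {"left": True, "right": True}
    while (is_open["left"] or is_open["right"]) and hi - lo < max_width:
        for side in ("left", "right"):
            if not is_open[side] or hi - lo >= max_width:
                continue
            cand = lo - 2 if side == "left" else hi
            if cand < 0 or cand + 2 > image_width:
                is_open[side] = False        # image border acts as a stop point
                continue
            average = total / ((hi - lo) // 2)                   # AE
            threshold = average * max(7 - added[side], 5) / 10   # T
            e = energy_2x2(img, top, cand)
            if e > threshold:
                total += e
                added[side] += 1
                if side == "left":
                    lo = cand
                else:
                    hi = cand + 2
            else:
                is_open[side] = False        # stop point found on this side
    return lo, hi - lo


if __name__ == "__main__":
    strip = [[10 if 10 <= c < 20 else 0 for c in range(40)] for _ in range(2)]
    print(grow_horizontal(strip, 0, 14))
```

Because the threshold falls from 0.7 to 0.5 of the average energy as blocks are added, the region is allowed to extend into gradually weaker neighbours before a stop point is declared.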






The final step of RESA (block 76 in Figure 5) is to grow the region vertically based on
strip 75, as shown in Figure 8. Assume the width of the strip 75 is W. Consider the
2*W strip rectangle above the current region, together with a threshold:

Ts = AEs*max(7-AUs, 5)/10

where AUs is the number of 2*W rectangles that have been added above the initial strip
and AEs is the average energy per 2*W rectangle included in the current region. If the
tested 2*W rectangle has an energy that is larger than the threshold, merge it into the
current region. Otherwise, a stop point has been found on this side. In a similar,
symmetric fashion, check the 2*W rectangle below the current region. Continue
growing first above and then below, until stop points are found on both sides or the
height of the current region has reached 32 (whichever comes first). In the end we
obtain a rectangle 77 that has dimension 2n*2m, 1<=n,m<=16.

With further reference to Figure 4, the process for finding the closest matched basis
from the given dictionary is illustrated (block 62). The degree of matching between a
basis and the residual image is represented by the absolute value (modulus) of their inner
product, which is called the atom modulus, wherein a large modulus implies a good
match. The process of determining this modulus requires computing a number of inner
products, and selecting the one with the largest modulus as the current atom. This
process can be the slowest part of the matching pursuit algorithm. In the classical
matching pursuit algorithm, the inner product between the residual image and each of
the millions of elements in the dictionary would need to be computed to determine the
modulus. In the prior art, for example, the 16*16 block with the highest energy in the
residual image is simply selected as the initial search region - each basis structure is
centered at each location in the chosen block, and the inner product between the basis
structure and the corresponding residual region is computed. For a dictionary with
400 bases, this process requires 256x400=102400 inner product calculations. Figure 9
illustrates the new matching pursuit process according to the present invention.

The resulting RESA rectangle 77 in Figure 8 provides an initial estimation for the shape
of the high-energy feature. It is used to filter out bases in the dictionary that have a
shape that is too different from the RESA rectangle. A subset of matching basis
candidates (block 80) is then formed. Assume the width and height of rectangle 77 are w
and h respectively; a sub-dictionary is formed containing all bases with shapes, specified
by width and height respectively, that satisfy:

w-tw1<=width<=w+tw2 and h-th1<=height<=h+th2

where tw1, tw2, th1 and th2 are values set to confine the basis size. These values may be
changed and adjusted according to the dictionary structure. The largest and smallest
sizes of tested bases are illustrated as rectangles 90 and 91 in Figure 10. For
example, block B80 is a simple sub-dictionary example containing four bases.
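
The size filter can be illustrated with a short Python sketch. The Basis record and the sub_dictionary function are hypothetical names introduced for the example, and the tolerance values in the usage lines are arbitrary; the text leaves tw1, tw2, th1 and th2 to be tuned to the dictionary.

```python
from typing import List, NamedTuple


class Basis(NamedTuple):
    """Minimal stand-in for a dictionary entry: only its index and shape."""
    index: int
    width: int
    height: int


def sub_dictionary(bases: List[Basis], w: int, h: int,
                   tw1: int, tw2: int, th1: int, th2: int) -> List[Basis]:
    """Keep the bases whose shape is close enough to the w x h RESA rectangle."""
    return [b for b in bases
            if w - tw1 <= b.width <= w + tw2 and h - th1 <= b.height <= h + th2]


if __name__ == "__main__":
    dictionary = [Basis(0, 4, 4), Basis(1, 8, 6), Basis(2, 20, 20), Basis(3, 10, 9)]
    for basis in sub_dictionary(dictionary, w=8, h=8, tw1=2, tw2=2, th1=2, th2=2):
        print(basis)
```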

RESA can further estimate the location of high-energy features in the residual image.
The position candidates for matching bases are selected around the center of the RESA
rectangle 77 (block 81). Figure 10 shows a small rectangle 92 whose center is the same
as RESA rectangle 77. It is supposed that all pixels within rectangle 92 will work as a
center for the tested residual region. Rectangle 94 in Figure 10 is an example whose
center is point 93, or the left-top corner of rectangle 92. The width (ws) and height (hs)
of rectangle 92 are supposed to vary with RESA rectangle 77. The relationship is:

ws=2*min(w/2+1,6) and hs=2*min(h/2+1,6)

The size of rectangle 92 can be decided by other rules or simply be fixed in an
implementation. The basic idea is that a good match is located around the center of the
RESA rectangle 77. Furthermore, any positions within rectangle 92 that already contain
the center of an atom will not be considered for any new atoms. Point 95 in Figure 10 is
an example. It should be noted that the prior art does not place such a restriction. The
idea for this type of restriction is that if one atom provides a good fit, it should remove
the energy around its center without introducing too much extra energy at its boundary.
As such it is not desired for the matching pursuit algorithm to return to the same position
to produce a second atom. This restriction of forcing no position repetition has almost
no effect on coding performance and can make the coding of the atom position
information simpler.
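
As a small worked example, the window size rule and the exclusion of already-used centers might look like this in Python. The function names, the (x, y) tuple convention, and the way the even-sized window is placed around the center (cx, cy) are assumptions made for the sketch.

```python
def search_window(w, h):
    """Width and height (ws, hs) of the candidate window for a w x h RESA rectangle."""
    ws = 2 * min(w // 2 + 1, 6)
    hs = 2 * min(h // 2 + 1, 6)
    return ws, hs


def position_candidates(cx, cy, w, h, occupied):
    """Candidate atom centers around (cx, cy), skipping positions that
    already hold the center of a previously found atom."""
    ws, hs = search_window(w, h)
    x0, y0 = cx - ws // 2, cy - hs // 2
    return [(x, y)
            for y in range(y0, y0 + hs)
            for x in range(x0, x0 + ws)
            if (x, y) not in occupied]


if __name__ == "__main__":
    print(search_window(4, 20))
    print(len(position_candidates(10, 10, 4, 4, occupied={(10, 10)})))
```

The cap of 6 inside min() bounds the window at 12x12 positions, so the number of candidates stays small even for the largest RESA rectangles.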

The next processing step (block 89 in Figure 9) is called the progressive elimination
algorithm (PEA) for the residual matching pursuit. It is independent of the method used
to form the testing basis sub-dictionary and set of testing positions. For example, PEA
will still operate if the sub-dictionary is the entire dictionary, and the set of position
candidates is the set of coordinates comprising the whole residual image. PEA is a
method of finding the closest matching basis more efficiently by progressively removing
comparison candidates from consideration. This contrasts with classical matching
pursuit, which compares all basis candidates at all possible positions. Initially the
maximum modulus Mm is set to be zero (block 82). Next a basis b(k,l) is considered
(block 83), where k and l represent the width and height of the 2D basis function. A
same sized region r(k,l,p), centered at one position candidate p in the residual image, is
formed (block 84). Block 85 compares ||r(k,l,p)||, the norm of r(k,l,p) (the square root of
its energy), with the current maximum modulus (Mm) to decide if there is a need to
calculate the inner product between r(k,l,p) and b(k,l). In order to explain this operation,
recall the Cauchy-Schwarz inequality:

|<r(k,l,p),b(k,l)>| <= ||r(k,l,p)|| ||b(k,l)||

The objective of matching pursuit is to find the maximum |<r(k,l,p),b(k,l)>|. Assume the
current maximum modulus is Mm. If, for basis b(k,l) at position p, the corresponding
residual r(k,l,p) satisfies ||r(k,l,p)|| ||b(k,l)|| <= Mm, then:

|<r(k,l,p),b(k,l)>| <= ||r(k,l,p)|| ||b(k,l)|| <= Mm

In this case, it is unnecessary to calculate the inner product <r(k,l,p),b(k,l)>, and the
region r(k,l,p) is moved to the next position. The norm of the basis ||b(k,l)|| can be
calculated a priori (actually most of the bases are normalized, namely ||b(k,l)||=1), so the
only overhead for this test is to calculate the energy of r(k,l,p). An effective
algorithm to determine ||r(k,l,p)|| is described below.

Assume there are n different sizes of basis heights {$v_1, v_2, \dots, v_n$}, and m different sizes
of basis widths {$h_1, h_2, \dots, h_m$}, that are increasingly ordered. The search rectangle
dimension is hs*ws, and the left-top point of the search rectangle is p(x,y). The
hs*ws*n*m energy values can be calculated through the following four steps:

Step 1: Calculate the energy for the $s = h_m + ws$ columns (Figure 11 shows an example of
the columns). These columns are centered at $(x - h_m/2 + i,\; y)$, $i = 0, 1, \dots, s-1$. Their height
is $v_1$. Their energy is represented as $C_{1,0}(0), C_{1,1}(0), \dots, C_{1,s-1}(0)$, and calculated
as:

$$C_{1,i}(0) = e(x - h_m/2 + i,\; y - v_1/2) + \dots + e(x - h_m/2 + i,\; y) + \dots + e(x - h_m/2 + i,\; y + v_1/2)$$

where e(x,y) represents the energy of the pixel at position (x,y).

The energies for the next s columns, with the same coordinates as the above strips and length $v_2$,
can be computed as:

$$C_{2,i}(0) = C_{1,i}(0) + \text{energy of the extra } (v_2 - v_1) \text{ pixels}, \quad i = 0, 1, \dots, s-1$$

Generally, we have:

$$C_{j,i}(0) = C_{j-1,i}(0) + \text{energy of the extra } (v_j - v_{j-1}) \text{ pixels}, \quad i = 0, 1, \dots, s-1;\; j = 2, \dots, n$$

Step 2: Calculate the energy of the columns that are vertical shifts of the columns in Step 1, using:

$$C_{j,i}(a) = C_{j,i}(a-1) - e(x - h_m/2 + i,\; y - v_j/2 + a - 1) + e(x - h_m/2 + i,\; y + v_j/2 + a), \quad a = 1, \dots, hs$$

where a represents the vertical shift number corresponding to y.

Step 3: Calculate the energies of regions with height $v_j$ ($j = 1, \dots, n$), width $h_1, h_2, \dots, h_m$
and center (x, y+a) ($a = 0, 1, \dots, hs$) using:

$$S_{j,1}(0,a) = C_{j,(h_m - h_1)/2}(a) + \dots + C_{j,h_m/2}(a) + \dots + C_{j,(h_m + h_1)/2}(a)$$

$$S_{j,2}(0,a) = S_{j,1}(0,a) + \text{energy of the extra } (h_2 - h_1) \text{ columns}$$

Generally,

$$S_{j,i}(0,a) = S_{j,i-1}(0,a) + \text{energy of the extra } (h_i - h_{i-1}) \text{ columns}, \quad i = 2, \dots, m$$

Step 4: Calculate the energies of the regions with vertical base length $v_j$ ($j = 1, \dots, n$),
horizontal base length $h_i$ ($i = 1, \dots, m$) and center (x+b, y+a) ($b = 1, \dots, ws$ and
$a = 1, \dots, hs$) using:

$$S_{j,i}(b,a) = S_{j,i}(b-1,a) - C_{j,(h_m - h_i)/2 + b - 1}(a) + C_{j,(h_m + h_i)/2 + b}(a)$$
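
The incremental idea behind Steps 1 to 4 can be sketched in Python for a single basis size. This is a simplified illustration, not the full multi-size scheme: it keeps running column energies that are updated when the window moves down one row (the Step 2 analogue) and a running window energy that is updated when the window moves right one column (the Step 4 analogue). The function name window_energies and the keying of results by top-left corner rather than by center are assumptions made for the example.

```python
def window_energies(img, bw, bh):
    """Energy of every bw x bh window of img, keyed by the window's top-left (x, y).
    Requires bw <= image width and bh <= image height."""
    height, width = len(img), len(img[0])
    sq = [[v * v for v in row] for row in img]   # e(x, y): per-pixel energy
    energies = {}
    # Step 1 analogue: column energies of height bh for the topmost position.
    col = [sum(sq[r][c] for r in range(bh)) for c in range(width)]
    for y in range(height - bh + 1):
        if y > 0:
            # Step 2 analogue: shift every column down by one pixel.
            for c in range(width):
                col[c] += sq[y + bh - 1][c] - sq[y - 1][c]
        # Step 3 analogue: first window of this row is a sum of bw columns.
        s = sum(col[:bw])
        energies[(0, y)] = s
        # Step 4 analogue: slide right, dropping one column and adding one.
        for x in range(1, width - bw + 1):
            s += col[x + bw - 1] - col[x - 1]
            energies[(x, y)] = s
    return energies


if __name__ == "__main__":
    image = [[(3 * r + 5 * c) % 7 - 3 for c in range(6)] for r in range(5)]
    print(window_energies(image, 3, 2)[(1, 1)])
```

Each window energy costs a constant number of additions after the first one, instead of bw*bh multiplications, which is what makes the energy test cheap relative to an inner product.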

The maximum modulus can be updated successively during the matching pursuit
process; this can progressively confine the search space. Several bases can have the
same sizes, thus one energy calculation may avoid several inner product calculations.
The performance of PEA is also related to how fast a good match (not necessarily the
best match) is found. Because large regions always contain more energy, bases of larger
dimension are tested first.

If ||r(k,l,p)||>Mm, block 86 is executed to calculate the inner product (p) between r(k,l,p)
and b(k,l). Block 87 compares the absolute value of p with the current maximum modulus
Mm. If |p|>Mm, the new Mm is set as |p| and the corresponding basis index and position
are recorded. Regardless, we keep returning to block 84 until all search positions have
been checked. Then blocks 83 through 88 are run repeatedly until all basis candidates
have been tested. Finally, an atom is produced which includes three parameters: 1. The
index of the basis in the dictionary that gives the best match; 2. The location of the best
match in the residual image with (x, y) coordinates; and 3. The inner product (p)
between the basis and the residual image. Figure 12 shows an example of an atom on a
residual image.
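
The search loop with its energy-based pruning might be sketched in Python as below. This is an illustrative reading of the flowchart, with several assumptions: bases are small 2D lists, candidate positions are given as top-left corners rather than centers, region norms are computed directly instead of with the fast scheme described earlier, and the name pea_search is hypothetical.

```python
import math


def pea_search(residual, bases, positions):
    """Find the (basis index, position, inner product) with the largest modulus.
    Also returns how many inner products were actually computed."""
    best_modulus, best_atom = 0.0, None      # Mm starts at zero
    computed = 0
    for index, basis in enumerate(bases):
        bh, bw = len(basis), len(basis[0])
        b_norm = math.sqrt(sum(v * v for row in basis for v in row))
        for (x, y) in positions:
            patch = [row[x:x + bw] for row in residual[y:y + bh]]
            if len(patch) != bh or len(patch[0]) != bw:
                continue                     # window falls outside the image
            r_norm = math.sqrt(sum(v * v for row in patch for v in row))
            # The modulus cannot exceed ||r|| * ||b||; if that bound does not
            # beat the current maximum, the inner product is skipped.
            if r_norm * b_norm <= best_modulus:
                continue
            p = sum(a * b for prow, brow in zip(patch, basis)
                    for a, b in zip(prow, brow))
            computed += 1
            if abs(p) > best_modulus:
                best_modulus, best_atom = abs(p), (index, (x, y), p)
    return best_atom, computed


if __name__ == "__main__":
    residual = [[0, 0, 0, 0], [0, 3, 3, 0], [0, 3, 3, 0], [0, 0, 0, 0]]
    bases = [[[0.5, 0.5], [0.5, 0.5]]]       # one normalized 2x2 basis
    positions = [(x, y) for y in range(3) for x in range(3)]
    print(pea_search(residual, bases, positions))
```

The pruning never changes the result, only the amount of work: the earlier a strong match is found, the more candidates are rejected by the energy test alone.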

With further reference to Figure 4, the step after finding an atom is to record the atom
parameters (block 63). Note that at this stage, no quantization of the atom's modulus is
performed. Decision block 64 decides when to begin atom quantization. Its
operation depends on the rate control goal defined by the video compression system. If
the compression ratio is fixed, block 64 checks whether bits are still available for more
atoms. Because no actual coding has been done yet, the bits used for coding the current
atoms have to be estimated. Let "Bip" represent the average bits for coding the basis
indices and positions, and let "Bm(i)" represent the actual bits for the i-th atom's modulus
without quantization. Allocating one bit for the sign of the inner product (p), the used
bits for n atoms are estimated as:

Used Bits = n*(Bip+1) + (Bm(1) + Bm(2) + ... + Bm(n))

where "Bip" is initialized according to empirical data for the first residual frame, and
thereafter set to the actual value from the previous frame. Bm(i) can be known exactly for
each modulus. An important fact is that the moduli will be quantized later, which will
result in fewer bits being used than currently estimated. Thus at this stage, there will
typically be fewer atoms than can be coded. If the video system wants to achieve a certain
quality, which is defined by the mean square error (MSE) of the coded residual image as
compared to the actual residual image, block 64 compares the current MSE achieved with
the MSE objective. The MSE after introducing one atom is updated according to the following
equation:

MSE(n) = MSE(n-1) - p(n)*p(n)

where MSE(n) represents the MSE after using n atoms and p(n) represents the inner
product of the n-th atom. Initially the MSE, or MSE(0), is set to the energy of the original
residual image. After quantization is performed, MSE(n) will likely increase, and
therefore will no longer achieve the MSE objective. In summary, if bits are available or
the quality goal has not been achieved, the residual image is updated based on the
current atom (block 65), followed by a search for another atom recommencing at
block 61. Otherwise, if the bit or quality objective has been achieved, block 66 is
executed for the quantization design. Residual image updating, one step of the standard
matching pursuit algorithm, can be described mathematically as:

r(k,l,p) = r(k,l,p) - p(n)*b(k,l) 
All regions not covered by the current atom will be unchanged. 
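
The bookkeeping for this first, unquantized stage can be sketched as follows. It is an illustrative sketch only: the function names, the NumPy representation, the unit-energy basis and the top-left (x, y) convention are assumptions, and the quantity called MSE in the text is tracked here as the residual energy, which is how MSE(0) is initialized.

```python
import numpy as np

def estimated_used_bits(bip, modulus_bits):
    """Used Bits = n*(Bip + 1) + (Bm(1) + ... + Bm(n)); the +1 is the sign bit."""
    n = len(modulus_bits)
    return n * (bip + 1) + sum(modulus_bits)

def apply_atom(residual, energy, basis, x, y, p):
    """Subtract p*basis from the covered region and return the updated energy.

    Assumes `basis` has unit energy and p is its inner product with the
    region whose top-left corner is (x, y). Pixels outside that region are
    left unchanged. `residual` must be a float array; it is modified in place.
    """
    bh, bw = basis.shape
    residual[y:y + bh, x:x + bw] -= p * basis
    return energy - p * p   # MSE(n) = MSE(n-1) - p(n)*p(n)
```

The energy shortcut holds only because p is the exact inner product with a unit-energy basis; once the modulus is quantized, the shortcut no longer applies.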

The design of the quantizer (block 66) is based on the minimum modulus (Minm) value
found so far. The quantization step size (QS) is set to:

QS = 32 if Minm > 24;
QS = 16 if 12 < Minm <= 24;
QS = 8 if 6 < Minm <= 12;
QS = 4 if Minm <= 6.
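
The step-size rule can be written as a small function. This is a sketch; the function name is hypothetical, and the lower bound of the second range is taken to be 12 so that the four ranges are contiguous.

```python
def quantization_step(min_modulus):
    """Map the minimum modulus found so far (Minm) to the step size QS."""
    if min_modulus > 24:
        return 32
    if min_modulus > 12:   # assumed lower bound of the second range
        return 16
    if min_modulus > 6:
        return 8
    return 4
```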

All atoms found up to this point will be quantized using the above QS in a simple
mid-tread scalar quantization scheme. Next, the residual image is updated again
according to the now quantized list of atom moduli (block 67). Assume that the atom
coefficients before and after quantization are p(i) and q(i) respectively (i=1,...,n), and
that the corresponding bases are b(i) (i=1,...,n). The residual image after n unquantized
atoms is:

E(n) = (Original Residual) - p(1)b(1) - p(2)b(2) - ... - p(n)b(n)

Its energy ||E(n)|| is also known. There are two ways to calculate the residual energy
after quantization. The first way is to simply calculate the residual image after
quantization as:

EQ(n) = (Original Residual) - q(1)b(1) - q(2)b(2) - ... - q(n)b(n)

Another way is to update it recursively. Assume the quantization error for p(i) is Δp(i).
Then the residual image with only p(n) being quantized is:

EQ(1) = E(n) - Δp(n)b(n) and ||EQ(1)|| = ||E(n)|| + Δp(n)*Δp(n) - 2*Δp(n)*<E(n), b(n)>

The residual with the quantization of p(n) and p(n-1) becomes:

EQ(2) = EQ(1) - Δp(n-1)b(n-1)

This relationship is true recursively and can be written as:

EQ(i) = EQ(i-1) - Δp(n-i+1)b(n-i+1), i=1,2,...,n, EQ(0) = E(n)

The corresponding energy is:

||EQ(i)|| = ||EQ(i-1)|| + Δp(n-i+1)*Δp(n-i+1) - 2*Δp(n-i+1)*<EQ(i-1), b(n-i+1)>

Finally, we obtain EQ(n) and ||EQ(n)||, which are the starting point for further atom finding.
Importantly, the list of atoms can be processed in any order for the recursive update;
the update does not need to occur in the order in which the atoms were found.
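
The recursive update can be checked numerically with a short sketch. The names below are hypothetical. The sketch assumes unit-energy bases stored as arrays of the same shape as the residual, takes the quantization error as q(i) - p(i) (the sign convention implied by the recursion), and uses a nearest-multiple rule as the mid-tread quantizer.

```python
import numpy as np

def mid_tread(p, qs):
    """Mid-tread scalar quantizer: nearest multiple of the step size qs."""
    return qs * float(np.round(p / qs))

def energy_after_quantization(e_n, coeffs, bases, qs):
    """Recursively update E(n) and its energy as each coefficient is quantized.

    e_n    : residual after n unquantized atoms, E(n)
    coeffs : unquantized coefficients p(i)
    bases  : unit-energy arrays b(i), each the same shape as e_n (assumption)
    """
    eq = e_n.copy()
    energy = float(np.sum(eq * eq))
    for p, b in zip(coeffs, bases):   # the order of the atoms does not matter
        dp = mid_tread(p, qs) - p     # assumed sign convention: q(i) - p(i)
        # ||EQ(i)|| = ||EQ(i-1)|| + dp*dp - 2*dp*<EQ(i-1), b>
        energy += dp * dp - 2.0 * dp * float(np.sum(eq * b))
        eq -= dp * b
    return eq, energy
```

The returned residual equals the one obtained by subtracting the quantized atoms directly from the original residual, so either route gives the same starting point for further atom finding.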

Because the moduli of the atoms have been quantized, more atoms will now be necessary to
achieve the rate control or quality objective. Therefore, block 68 is executed to find
additional atoms. The process is the same as blocks 61 through 63. However, the atom
moduli are quantized immediately at this stage. We now need to deal with atoms
whose moduli are smaller than (QS - QS/4), without throwing them out by setting their
quantization value to zero. The scheme used is given below:

1. If the atom modulus is larger than (QS - QS/4), then it is quantized using QS;

2. Otherwise, if the atom modulus is larger than (QS/2 - QS/8), then it is quantized as
value QS/2;

3. Otherwise, if the atom modulus is larger than (QS/4 - QS/16), then it is quantized
as value QS/4;

4. Otherwise, if the atom modulus is larger than (QS/8 - QS/32), then it is quantized
as value QS/8.

In practice, three levels down are typically sufficient, although more levels may be used.
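
The four rules above can be sketched as one function. The function name and the `levels` parameter are hypothetical. Returning zero for a modulus that falls below every threshold is an assumption, since the text does not state what happens in that case, and the nearest-multiple rule used for the first case is likewise an assumed form of the mid-tread quantizer.

```python
import math

def quantize_modulus(modulus, qs, levels=3):
    """Quantize a non-negative atom modulus once the step size qs is fixed."""
    if modulus > qs - qs / 4:
        # Rule 1: regular quantization with step QS (nearest multiple).
        return qs * math.floor(modulus / qs + 0.5)
    step = qs / 2
    for _ in range(levels):
        # Rules 2 to 4: thresholds QS/2 - QS/8, QS/4 - QS/16, QS/8 - QS/32.
        if modulus > step - step / 4:
            return step
        step /= 2
    return 0.0   # assumption: below every threshold, nothing is kept
```

For example, with QS = 16 a modulus of 7 is kept as 8 rather than being rounded down to zero.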

After block 68, a real rate control logic unit is executed (block 69). Because the atoms
are quantized in-loop at this stage, the achieved quality or actual number of bits used can
be estimated. When the compression goal is achieved, the system proceeds to the atom
encoder 42. Otherwise, the residual image is updated based on the quantized atom
modulus and the system returns to block 68 to find the next atom. For colour video,
a residual image contains several channels, i.e. Y, U and V channels. The atom
decomposer 40 is applied to each channel independently. With this scheme, each
channel can have its own bit budget or desired quality goal. Various bit
allocation methods can be used to allocate bit budgets to the different channels.

All the atoms are passed to the atom encoder 42 for output in a compressed form. The
present invention considers the atom distribution for each channel as a bi-value map, as
illustrated in Figure 13. The black pixels represent atoms in their corresponding
positions, while the white pixels represent a lack of atoms in those positions. A
quadtree-like technique can be used to encode the positions containing atoms, although other
techniques may be used as would be readily understood. The other parameters of each
atom can be encoded after the atom's position information, using variable length coding
for example; however, other encoding techniques may be used as would be known to a
worker skilled in the art. The coding procedure for the atom parameter signal is
illustrated in Figure 14 and described in more detail below.

The first step of atom encoding is to decompose the whole atom map, for example as
illustrated in Figure 13, into n*n blocks (block 101). The value n may be either 16 (for
the Y channel) or 8 (for the U and V channels). For each n*n block, if there are no
atoms in the block, a zero-bit is output; otherwise, a one-bit is output, and the block is
processed further to convey the locations of its atoms to the decoder. A quadtree
decomposition procedure is used for this, and is summarized in the following four steps:

Step 1. Initialize a list of atom blocks (LAB) with one element - the n*n block itself.

Step 2. Pick one element e from the LAB. If e's size is 1*1, output all atom parameters
except for the position, namely the basis index, the modulus and the sign of the inner
product of e, then proceed to step 4; otherwise, proceed to step 3.

Step 3. Output the atom pattern bits of the four sub-blocks of e: a1a2a3a4, where ai
(i=1,2,3,4) is one if there is an atom in the corresponding sub-block and zero
otherwise. Put all sub-blocks i with ai value equal to 1 at the end of the LAB
and return to step 2.

Step 4. Check if the LAB is empty. If it is not empty, return to step 2; otherwise the
encoding finishes for the one n*n block.
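
The four steps can be sketched as follows. Several details are assumptions made for the example and are not fixed by the text: elements are picked from the front of the LAB, the sub-blocks are ordered top-left, top-right, bottom-left, bottom-right, each position holds at most one atom, and the output is a list of symbols (pattern strings and parameter tuples) rather than variable length coded bits.

```python
from collections import deque

def encode_block(atoms, n):
    """Encode one n*n block (n a power of two) containing at least one atom.

    atoms maps (x, y) -> (basis_index, modulus, sign). Returns the output
    symbols in order: 4-character pattern strings and parameter tuples.
    """
    out = []
    lab = deque([(0, 0, n)])               # Step 1: LAB holds the whole block
    while lab:                             # Step 4: finish when LAB is empty
        x0, y0, size = lab.popleft()       # Step 2: pick one element e
        if size == 1:
            out.append(atoms[(x0, y0)])    # all parameters except position
            continue
        half = size // 2
        subs = [(x0, y0), (x0 + half, y0),
                (x0, y0 + half), (x0 + half, y0 + half)]
        pattern = ""
        for sx, sy in subs:                # Step 3: pattern bits a1a2a3a4
            has_atom = any(sx <= x < sx + half and sy <= y < sy + half
                           for x, y in atoms)
            pattern += "1" if has_atom else "0"
            if has_atom:
                lab.append((sx, sy, half)) # queue the sub-block at the end
        out.append(pattern)
    return out
```

Because positions are implied by the sequence of pattern bits, no explicit coordinates appear in the output.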

The basis index and atom modulus can be coded using a variable length coder to
conserve bits, since these signal parameters may not be uniformly distributed. The atom
position information is encoded implicitly by recording the decomposition
procedure with the 0/1 bit data. A variable length coding method can be used to encode
the atom pattern bits of the four sub-blocks, a1a2a3a4. There are 15 possible patterns
for the atom pattern bits a1a2a3a4, since 0000 is impossible.
However, some patterns, such as 1000, occur with a much higher probability than other
patterns. The probability of the different patterns can be estimated through experiments
and used to design a variable length code table. Further, it should be noted that the
probability distribution can vary between channels and between atom
densities. Therefore multiple tables can be used, and the block's category information
can be encoded first so the decoder knows which table should be used for decoding
purposes.
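
One way to derive such a table from measured pattern frequencies is Huffman coding. The text does not name a specific construction, so Huffman coding is used here only as a familiar example, and the function name and counts are hypothetical.

```python
import heapq

def build_vlc_table(pattern_counts):
    """Build a prefix-free code from {pattern: count} using Huffman coding."""
    # Each heap entry is (count, tie_breaker, {pattern: code_so_far}).
    heap = [(count, i, {pattern: ""})
            for i, (pattern, count) in enumerate(sorted(pattern_counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {pattern: "0" for pattern in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)    # two least frequent subtrees
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t0.items()}
        merged.update({s: "1" + code for s, code in t1.items()})
        heapq.heappush(heap, (c0 + c1, tie, merged))
        tie += 1
    return heap[0][2]
```

A separate table could be built in this way for each channel or atom density category, with the category signalled before the pattern bits.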

Figure 15 illustrates the atom decoder 46, which performs operations that are the inverse
of those performed by the atom encoder 42. First, the atom decoder 46 receives one bit
representing the status of the current n*n block. If the value is one, the block is processed
through the symmetric quadtree decomposition procedure. Initially, the n*n block is
divided into four sub-blocks. The atom pattern bits for the four sub-blocks are decoded
using inverse variable length coding (VLC). Then all the sub-blocks with value 1 are
put into a list of atom blocks (LAB). The LAB is updated dynamically by decomposing
each element in the LAB recursively and getting its atom pattern bits. If an element
from the LAB is a 1*1 block, the atom basis index and the modulus are decoded
using the inverse VLC tables; the bit representing the sign of the inner product is
then read in. The decoding of one n*n block is finished when the LAB becomes
empty.
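
The decoding of one block can be sketched as follows. The sketch assumes the symbols have already been through inverse VLC, that elements are taken from the front of the LAB, and that sub-blocks are ordered top-left, top-right, bottom-left, bottom-right; none of these details is fixed by the text, and the function name is hypothetical.

```python
from collections import deque

def decode_block(symbols, n):
    """Rebuild {(x, y): atom_parameters} for one n*n block flagged non-empty.

    `symbols` yields already VLC-decoded items in coded order: 4-character
    pattern strings for blocks larger than 1*1, parameter tuples for 1*1.
    """
    stream = iter(symbols)
    atoms = {}
    lab = deque([(0, 0, n)])
    while lab:                              # finished when the LAB is empty
        x0, y0, size = lab.popleft()
        if size == 1:
            atoms[(x0, y0)] = next(stream)  # basis index, modulus, sign
            continue
        half = size // 2
        pattern = next(stream)              # atom pattern bits a1a2a3a4
        subs = [(x0, y0), (x0 + half, y0),
                (x0, y0 + half), (x0 + half, y0 + half)]
        for bit, (sx, sy) in zip(pattern, subs):
            if bit == "1":
                lab.append((sx, sy, half))  # sub-block contains an atom
    return atoms
```

The decoder recovers each atom position from the path of pattern bits alone, which is why the encoder and decoder must traverse the LAB in the same order.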

The decoded atom parameter signal is then passed to the residual re-constructor 48,
which forms the residual image one channel at a time using the method of
classical matching pursuit. Initially all pixels of the residual image are set to zero.
Then each atom is added one by one using the following procedure: Let q(i) and b(i,k,l)
represent the i-th atom coefficient and the corresponding 2D basis matrix respectively.
If (x(i), y(i)) represents the location of the i-th atom, then the matrix q(i)*b(i,k,l) is
added to the residual image constructed so far at position (x(i), y(i)) to get the new
current residual image. The process repeats until all atoms have been added for the
channel. Once every channel has been processed, the process is finished and the
residual image has been reconstructed.
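
The reconstruction of one channel can be sketched as follows. The function name, the atom tuple layout and the use of (x, y) as the top-left corner of the basis are assumptions made for the example.

```python
import numpy as np

def reconstruct_channel(shape, atoms, dictionary):
    """Rebuild one residual channel from its decoded atoms.

    atoms: iterable of (basis_index, x, y, q), with q the signed quantized
    coefficient; dictionary: list of 2D basis arrays.
    """
    residual = np.zeros(shape, dtype=float)    # all pixels start at zero
    for k, x, y, q in atoms:
        basis = dictionary[k]
        bh, bw = basis.shape
        residual[y:y + bh, x:x + bw] += q * basis
    return residual
```

Since the atoms are simply summed, the order in which they are added does not affect the reconstructed image.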

Those familiar with the previous matching pursuit based video coding art will recognize
a number of advantages associated with the techniques according to the present
invention. The atom decomposition process based on an over-complete basis space has
been sped up through a more accurate energy region estimation procedure and through
the progressive candidate elimination algorithm. The atom modulus quantizer design is
seamlessly chosen by the atom decomposition scheme, while the previous art specified
the quantizer before the transformation began. Finally, the atom encoding process is
more efficient because spatial relationships between the atoms are exploited by the
invented quadtree-based decomposition scheme. In particular, the prior art collects all
atoms into a 1D list, thereby making them harder to code efficiently than in
the present invention.

The embodiments of the invention being thus described, it will be obvious that the same
may be varied in many ways. Such variations are not to be regarded as a departure from
the spirit and scope of the invention, and all such modifications as would be obvious to
one skilled in the art are intended to be included within the scope of the following
claims.